Abstract

We introduce a general technique to create an extended formulation 
of a mixed-integer program. We classify the integer variables into blocks, 
each of which generates a finite set of vector values. The extended formu- 
lation is constructed by creating a new binary variable for each generated 
value. Initial experiments show that the extended formulation can have a 
more compact complete description than the original formulation. 

We prove that, using this reformulation technique, the facet descrip- 
tion decomposes into one "linking polyhedron" per block and the "aggre- 
gated polyhedron". Each of these polyhedra can be analyzed separately. 
For the case of identical coefficients in a block, we provide a complete 
description of the linking polyhedron and a polynomial-time separation 
algorithm. Applied to the knapsack with a fixed number of distinct coeffi- 
cients, this theorem provides a complete description in an extended space 
with a polynomial number of variables. 

Based on this theory, we propose a new branching scheme that analyzes 
the problem structure. It is designed to be applied in those subproblems 
of hard integer programs where LP-based techniques do not provide good 
branching decisions. Preliminary computational experiments show that it 
is successful for some benchmark problems of multi-knapsack type. 

1 Introduction 

Extreme representations of the feasible points of a mixed-integer linear opti- 
mization problem are either given by means of the facet defining inequalities in 
the original space or by a set of feasible mixed integer points whose convex hull 
contains the feasible region. It is well known that in principle one such extreme 
representation can be transformed into the other extreme representation. However,
from an algorithmic point of view, both extreme representations are very
hard to obtain.

This suggests searching for other, "intermediate" representations that are
algorithmically more tractable, in the sense that they

• require fewer variables than the extreme representation by the vertices,

• require fewer constraints than the total number of facets of the
convex hull,

• have a simpler combinatorial constraint structure than the facets of the
convex hull in the original space, so that the separation problem in the
extended space is easier to solve.

Intermediate representations of the feasible region are complete descriptions of 
an extended formulation of the original problem. To make this notion precise, 
we define: 

Definition 1 (Representation by projection). Let $P \subseteq \mathbb{R}^n$, $P' \subseteq \mathbb{R}^d$ be
two rational polyhedra and $B \in \mathbb{Q}^{n \times d}$ a rational matrix. We call $P' \cap \mathbb{Z}^d$ a
representation of $P \cap \mathbb{Z}^n$ if the following two properties hold:

(a) $P \cap \mathbb{Z}^n = \{ x \in \mathbb{Z}^n : x = By,\ y \in P' \cap \mathbb{Z}^d \}$.

(b) $\operatorname{conv}(P \cap \mathbb{Z}^n) = \{ x \in \mathbb{R}^n : x = By,\ y \in P' \}$.

Such a representation is called extreme if either $d = n$ and $B = I$ or if
$P' = \{ y \in \mathbb{R}^d_+ : \sum_{i=1}^d y_i = 1 \}$; otherwise, it is called intermediate.

We remark that R. K. Martin [16] calls the sets $P' \cap \mathbb{Z}^d$ and $P \cap \mathbb{Z}^n$ "strongly
equivalent" in this situation.

In the literature, there are a couple of interesting examples of this type. 
Chopra and Rao [7, 8] introduced a directed formulation for the Steiner tree
problem and showed that exponentially many inequalities in the undirected
formulation are projections of a small number of directed inequalities. R. K. Martin [17]
reports on the minimum spanning tree problem, which has an inequality
formulation of size $O(2^n)$. It can, however, alternatively be described as the
projection of an extended formulation which requires $O(n^3)$ variables and $O(n^2)$
constraints. Moreover, there are many further compact extended formulations
for specific combinatorial optimization problems, in particular for lot-sizing and 
fixed-charge network problems; see, for instance, [13, 18, 16]. 

Next we illustrate on an example that also quite general problems such as 
knapsack problems can sometimes be described in an extended space such that 
the higher dimensional polyhedron is much more appealing than the original 
facet description. 

Example 2. Consider the set of $x \in \{0,1\}^8$ such that

\[ 8x_0 - x_1 - 2x_2 - 3x_3 - 4x_4 - 5x_5 - 6x_6 - 7x_7 \le 0. \tag{1} \]

The convex hull of solutions to this knapsack problem is given by the following system
of thirteen inequalities:
One way to obtain an extended formulation for (1) is to introduce two new variables
$x_{\{1,2\}}$ and $x_{\{3,4\}}$ for the subsets $\{1,2\}$ and $\{3,4\}$, which are equal to one
if both elements 1 and 2 (3 and 4, respectively) are selected. This yields the
following reformulation:

\[ 8x_0 - x_1 - 2x_2 - 3x_3 - 4x_4 - 5x_5 - 6x_6 - 7x_7 - 3x_{\{1,2\}} - 7x_{\{3,4\}} \le 0, \]
\[ x_1 + x_2 + x_{\{1,2\}} \le 1, \]
\[ x_3 + x_4 + x_{\{3,4\}} \le 1. \]

The convex hull of all feasible binary solutions to this system is given by the following
list of nine inequalities:

Note that the number of inequalities for the extended formulation is smaller
than in the original space. More importantly, the structure of the inequalities in the
extended space is significantly nicer than the structure of the inequalities
in the original space. For instance, the maximum coefficient occurring in the inequalities
in the higher dimensional space is 2, whereas the highest coefficient in the inequalities
in the original space is already 5.

In the example, the extended formulation that we propose is based on intro- 
ducing two new variables that correspond to products of variables in the original 
space. This is a special case of an extended formulation that one can obtain 
from the so-called Lift-and-Project approach. This approach has its roots in the 
work of Egon Balas on disjunctive optimization [3, 4]. It was further refined in 
[19, 15, 5, 6] by introducing hierarchies of extended formulations whose variables 
represent more general subsets of original variables. The disadvantage of this 
approach is that the number of variables grows exponentially with the size of 
the subsets for which we introduce new variables. 

The tool that we propose in this paper to generate an extended formulation 
is the value-disjunction procedure. It is another generalization of introducing 
one new variable for each pair of original variables. However, it also applies to
subsets of original variables of larger cardinality and offers lots of freedom in 
generating the extended formulation. It is a general way to produce intermediate 
representations for mixed integer optimization problems. In fact, it provides a 
hierarchy of new formulations. Specifically, for any subset of original variables 
we can always introduce an extended formulation that keeps the number of new 
variables linear in the size of the subset. 

We introduce the value disjunction procedure in Section 2. We then describe 
the convex hull of the given mixed-integer set as the intersection of several 
simpler polyhedra using the variables of the extended space. This is the structure 
theorem for the value disjunction procedure. In Section 3 we introduce the 
family of linking polyhedra. In the special but important case that such a linking 
polyhedron comes from the unweighted sum of a set of variables, we completely 
describe the polyhedron by means of linear inequalities and equations. As an 
application of the structure theorem in Section 2 together with the polyhedral 
characterizations of Section 3, we are able to determine an explicit description 
of the convex hull of all solutions to a 0/1 knapsack problem with only a fixed 
number of different weights. This is the topic of Section 4. 

Finally, in Section 5, we investigate one way of making computational use 
of value disjunctions: By branching also on the new binary variables of the 
extended formulation instead of only on the original variables, it is possible 
to take more flexible branching decisions. In fact, we propose such a branching 
scheme for situations where none of the usual LP-based variable selection criteria 
provides a solid basis for taking a branching decision. Such situations frequently 
occur in very hard integer programs like the market-split instances [10]. We
investigate how branching simplifies the facet description: a branching
decision is considered good if the facet descriptions of the generated subproblems
are significantly simpler than the original facet description. Using experiments
with randomly generated problem instances, we show that it is possible to make 
a branching decision based on the structure of the problem which is better than 
branching on the original variables. Finally we report on simple computational 
experiments with a few hard integer programs, where we branch explicitly on 
the new binary variables and then solve the subproblems with the branch-and- 
cut system CPLEX. We obtain a significant reduction in both the number of 
nodes and the computation time. 

2 Value disjunctions 

In this section, we present a structural result about an extended formulation 
of a given mixed integer programming model. To this end, consider a bounded 
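mixed-integer set of the form

\[ T = \Big\{ (x,w) \in \mathbb{Z}^n \times \mathbb{R}^d : \sum_{j=1}^{n} A_j x_j + \sum_{j=1}^{d} G_j w_j \le b,\ 0 \le x \le u \Big\}, \]

where $A_j, G_j \in \mathbb{R}^m$ for all $j$, $b \in \mathbb{R}^m$, and $u \in \mathbb{Z}^n_+$. We set $P = \operatorname{conv} T$.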

Let us partition the set $N = \{1, \dots, n\}$ into subsets $N_1, \dots, N_K$. For each
of the sets $N_i$, we determine all the possible vectors ("values") generated by the
columns $A_j$ belonging to the variables indexed by $N_i$:

\[ \mathcal{A}_i = \Big\{ \sum_{j \in N_i} A_j x_j : x_j \in \{0, \dots, u_j\} \text{ for } j \in N_i \Big\}. \]

Since the integer variables are assumed to be bounded, the set $\mathcal{A}_i$ is finite;
its cardinality $n_i = |\mathcal{A}_i|$ is at most $\prod_{j \in N_i} (1 + u_j)$. Let the elements of $\mathcal{A}_i$
be numbered, $\mathcal{A}_i = \{ a^{N_i,1}, \dots, a^{N_i,n_i} \}$. We shall associate with $a^{N_i,k}$ a new binary
variable $y_k^{N_i}$. In order to simplify the subsequent exposition, we shall also
use the abbreviating notations $A(x^{N_i}) = \sum_{j \in N_i} A_j x_j$ and moreover
$A(y^{N_i}) = \sum_{k=1}^{n_i} a^{N_i,k} y_k^{N_i}$ and $A(y) = \sum_{i=1}^{K} A(y^{N_i})$.
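To make the construction of the value sets concrete, the following Python sketch enumerates the values generated by one block. It is only an illustration under stated assumptions: the function name `block_values` and the representation of columns as tuples are hypothetical choices, not part of the paper.

```python
from itertools import product


def block_values(columns, upper_bounds):
    """Enumerate all vectors sum_j A_j * x_j with 0 <= x_j <= u_j.

    columns      -- list of columns A_j (tuples of equal length m), one per variable in the block
    upper_bounds -- list of integer upper bounds u_j, one per variable in the block
    Returns the distinct generated vectors as a sorted list of tuples.
    """
    m = len(columns[0])
    values = set()
    # Brute-force enumeration of all integer assignments of the block.
    for x in product(*(range(u + 1) for u in upper_bounds)):
        value = tuple(sum(col[r] * xj for col, xj in zip(columns, x)) for r in range(m))
        values.add(value)
    return sorted(values)


# A one-row block with four identical columns of weight 3 and binary variables.
identical = block_values([(3,), (3,), (3,), (3,)], [1, 1, 1, 1])

# A one-row block with two unit columns and upper bounds 2.
bounded = block_values([(1,), (1,)], [2, 2])
```

The number of distinct values can be far smaller than the number of assignments of the block, which is what keeps the number of new binary variables manageable when many columns coincide.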

We come to two major definitions that we make use of in this paper. 

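Definition 3. For a given subset $N_i$, we define the linking polyhedron as

\[ V_i = \operatorname{conv}\Big\{ (x^{N_i}, y^{N_i}) \in \mathbb{Z}^{|N_i|} \times \{0,1\}^{n_i} :
   \sum_{j \in N_i} A_j x_j = \sum_{k=1}^{n_i} a^{N_i,k} y_k^{N_i},\
   \sum_{k=1}^{n_i} y_k^{N_i} \le 1,\
   0 \le x_j \le u_j \text{ for } j \in N_i \Big\}. \tag{2} \]

Furthermore we define the aggregated polyhedron as

\[ Q = \operatorname{conv}\Big\{ (y,w) \in \{0,1\}^{n_1 + \dots + n_K} \times \mathbb{R}^d :
   \sum_{i=1}^{K} \sum_{k=1}^{n_i} a^{N_i,k} y_k^{N_i} + \sum_{j=1}^{d} G_j w_j \le b,\
   \sum_{k=1}^{n_i} y_k^{N_i} \le 1 \text{ for all } i = 1, \dots, K \Big\}. \tag{3} \]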



Thus, for every value in a set $\mathcal{A}_i$ we are introducing a new binary variable
$y_k^{N_i}$. With this family of new variables, we can obtain a new, extended
formulation of $T$ by linking the original variables $x_j$ with the new "value
variables" $y_k^{N_i}$. The precise link between the extended formulation and the original
formulation is given in the following theorem. Before stating the theorem we
illustrate our constructions on an example.

Example 4. Consider the convex hull $P$ of all binary solutions to the inequality

\[ 3x_1 + 3x_2 + 3x_3 + 3x_4 + 4x_5 + 7x_6 + 8x_7 + 9x_8 + 13x_9 + 15x_{10} \le 45. \]

We then introduce the subsets

\[ N_1 = \{1,2,3,4\},\quad N_2 = \{5\},\quad \dots,\quad N_7 = \{10\}. \]

We define

\[ V_1 = \operatorname{conv}\Big\{ (x, y^{N_1}) \in \mathbb{Z}^4_+ \times \{0,1\}^4 :
   3x_1 + 3x_2 + 3x_3 + 3x_4 = 3y_1^{N_1} + 6y_2^{N_1} + 9y_3^{N_1} + 12y_4^{N_1},\
   0 \le x_i \le 1,\ i = 1, \dots, 4 \Big\}. \tag{4} \]

Since $V_2, \dots, V_7$ consist of single points each, these polyhedra are trivial. No additional
$y$-variables are needed. Then, $Q$ becomes

\[ Q = \operatorname{conv}\Big\{ (y^{N_1}, x_5, \dots, x_{10}) \in \{0,1\}^{10} :
   3y_1^{N_1} + 6y_2^{N_1} + 9y_3^{N_1} + 12y_4^{N_1} + 4x_5 + 7x_6 + 8x_7 + 9x_8 + 13x_9 + 15x_{10} \le 45, \]
\[ \qquad y_1^{N_1} + y_2^{N_1} + y_3^{N_1} + y_4^{N_1} \le 1,\
   x_i \in \{0,1\} \text{ for } i = 5, \dots, 10 \Big\}. \tag{5} \]

In this example, there are several other ways to define an extended formulation based
on introducing new variables for the values that $x_1 + x_2 + x_3 + x_4$ can attain. One could
introduce one particular integer variable $z$ that represents the value of $x_1 + x_2 + x_3 + x_4$.
Alternatively, one could introduce a binary expansion for the values of $x_1 + x_2 + x_3 + x_4$,
i.e., one introduces binary variables $z_1, z_2, z_3$ and requires that $x_1 + x_2 + x_3 + x_4 =
z_1 + 2z_2 + 4z_3$. For each of these models we compute the facet description of the
corresponding convex hull, as indicated in Table 1.

In the original formulation there are 328 facets needed to describe the polyhedron.
If we introduce one additional integer variable that encodes the value of the constraint
$x_1 + x_2 + x_3 + x_4$, then the same number of inequalities suffice to describe the
corresponding convex hull of solutions. This is geometrically clear because every inequality
of the original formulation is in bijection with an inequality in the lifted space. However,
introducing three new binary variables $z_1, z_2, z_3$ and encoding the values of the
partial constraint $x_1 + x_2 + x_3 + x_4$ through the additional three variables $2^0 z_1$, $2^1 z_2$,
$2^2 z_3$, we obtain a polyhedron in the 13-dimensional space that requires 217 facets for
a complete description. The value disjunction based on $x_1 + x_2 + x_3 + x_4$ requires
introducing four new binary variables that are linked to the original variables by the two
constraints

\[ z_1 + z_2 + z_3 + z_4 \le 1, \qquad x_1 + x_2 + x_3 + x_4 = z_1 + 2z_2 + 3z_3 + 4z_4. \]

This new formulation in the 14-dimensional space requires only 77 facets for a complete
description.
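The facet counts above require a polyhedral code, but property (a) of Definition 1 for the value disjunction of Example 4 can be checked by plain enumeration. The following Python sketch does this; the function names and the brute-force approach are illustrative assumptions rather than the authors' procedure.

```python
from itertools import product

WEIGHTS = [3, 3, 3, 3, 4, 7, 8, 9, 13, 15]  # coefficients of x_1, ..., x_10 in Example 4
CAPACITY = 45


def original_points():
    """Binary solutions of the knapsack inequality in the original space."""
    return {x for x in product((0, 1), repeat=10)
            if sum(w * xi for w, xi in zip(WEIGHTS, x)) <= CAPACITY}


def projected_points():
    """x-parts of binary solutions (x, z) of the value-disjunction formulation."""
    points = set()
    for x in product((0, 1), repeat=10):
        for z in product((0, 1), repeat=4):
            if sum(z) > 1:                      # packing constraint z_1 + ... + z_4 <= 1
                continue
            block_value = sum((k + 1) * zk for k, zk in enumerate(z))
            if sum(x[:4]) != block_value:       # linking constraint for the block {1,2,3,4}
                continue
            rest = sum(w * xi for w, xi in zip(WEIGHTS[4:], x[4:]))
            if 3 * block_value + rest <= CAPACITY:  # aggregated knapsack row
                points.add(x)
    return points


same_projection = original_points() == projected_points()
```

Here `same_projection` compares the two sets of integer points only; it says nothing about the number of facets of either convex hull.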



Table 1: Sizes of facet descriptions of various reformulations

Formulation         Equations                                        # Facets

original                                                             328
integer expansion   $x_1 + x_2 + x_3 + x_4 = z$                      328
binary expansion    $x_1 + x_2 + x_3 + x_4 = z_1 + 2z_2 + 4z_3$      217
value disjunction   $x_1 + x_2 + x_3 + x_4 = z_1 + 2z_2 + 3z_3 + 4z_4$,   77
                    $z_1 + z_2 + z_3 + z_4 \le 1$
Theorem 5 (Structure Theorem for Value Disjunction).

\[ P = \Big\{ (x,w) \in \mathbb{R}^n \times \mathbb{R}^d : \text{there exists } y \in [0,1]^{n_1 + \dots + n_K} \text{ with }
   (y,w) \in Q \text{ and } (x^{N_i}, y^{N_i}) \in V_i \text{ for all } i \Big\}. \tag{6} \]



Proof. The inclusion $\subseteq$ is trivial. We shall prove the inclusion $\supseteq$. Let us consider
$(x, w)$ from the set on the right-hand side of (6). We want to prove that $(x, w) \in P$.
For such an $(x, w)$, we know that there exists $y$ such that $(x^{N_i}, y^{N_i}) \in V_i$.
Therefore there exist convex multipliers $\lambda^{N_i,l} \ge 0$ with $\sum_{l=1}^{L_i} \lambda^{N_i,l} = 1$ such
that

\[ (x^{N_i}, y^{N_i}) = \sum_{l=1}^{L_i} \lambda^{N_i,l} \, (x^{N_i,l}, y^{N_i,l}), \tag{7} \]

where $(x^{N_i,l}, y^{N_i,l})$ is an integral element of $V_i$ and $A(y^{N_i,l}) = A(x^{N_i,l})$. In
particular the $y$-part is made of exactly one 1-entry. Therefore

\[ y_t^{N_i} = \sum_{l \in T(N_i,t)} \lambda^{N_i,l} \tag{8} \]

with the sets $T(N_i,t)$, $t = 1, \dots, n_i$, being a packing of $\{1, \dots, L_i\}$, namely for
all $i$ we have

\[ \{1, \dots, L_i\} = T(N_i,1) \mathbin{\dot\cup} \dots \mathbin{\dot\cup} T(N_i,n_i), \tag{9} \]

where $C = A \mathbin{\dot\cup} B$ means $C = A \cup B$ and $A \cap B = \emptyset$. The insight of (8) is shown
in Figure 1.



Figure 1: Each $y$ is equal to the sum of zero, one or more $\lambda$ from the convex
combination.

Up to now we have used the fact that $(x^{N_i}, y^{N_i}) \in V_i$. We also have a second
condition stating that $(y, w) \in Q$. Therefore there exist convex multipliers $\sigma_r \ge 0$
with $\sum_{r=1}^{R} \sigma_r = 1$ such that

\[ y = \sum_{r=1}^{R} \sigma_r y^r \quad\text{and}\quad w = \sum_{r=1}^{R} \sigma_r w^r, \tag{10} \]

where

\[ y^r = \big( y^{N_1,r}, \dots, y^{N_K,r} \big) \]

and where $y^{N_i,r}$ is a unit vector. Furthermore

\[ \sum_{i=1}^{K} A(y^{N_i,r}) + \sum_{j=1}^{d} G_j w_j^r \le b. \]

We are now able to express $(x, w)$ as a convex combination of feasible
solutions of $Ax + Gw \le b$, using the convex combinations (10) and (7). To do
this, we first remark that, similarly to (8), we can express $y$ in terms of the $\sigma_r$ only,
namely

\[ y_t^{N_i} = \sum_{s \in S(N_i,t)} \sigma_s \tag{11} \]

with the sets $S(N_i,t)$, $t = 1, \dots, n_i$, being a packing of $\{1, \dots, R\}$, namely

\[ \{1, \dots, R\} = S(N_i,1) \mathbin{\dot\cup} \dots \mathbin{\dot\cup} S(N_i,n_i), \tag{12} \]

for all $i$. By using (8), we therefore conclude that

\[ \sum_{s \in S(N_i,t)} \sigma_s = \sum_{l \in T(N_i,t)} \lambda^{N_i,l}. \tag{13} \]

By using the similarity of decompositions (11) and (8), we can construct the 
desired convex combination as follows. 

Let us fix $r$, i.e., we consider each pair $(\sigma_r, y^r)$ separately. We know that
$y^r$ is divided into $K$ blocks with a unit vector in each block. In the block $N_i$,
we refer to the index of the non-zero component of $y^r$ as $c(y^{N_i,r})$. Using (8),
we can associate to $c(y^{N_i,r})$ a set $T(N_i, c(y^{N_i,r}))$ of indices $l$, which correspond
to multipliers $\lambda^{N_i,l}$ and vectors $x^{N_i,l}$ of the convex combination (7). For every
possible choice of indices

\[ l_1^r \in T(N_1, c(y^{N_1,r})),\ \dots,\ l_K^r \in T(N_K, c(y^{N_K,r})) \]

we consider the point

\[ x(l_1^r, \dots, l_K^r) = \big( x^{N_1,l_1^r}, \dots, x^{N_K,l_K^r} \big) \]

with a corresponding coefficient

\[ \nu(l_1^r, \dots, l_K^r) = \sigma_r \,
   \frac{\lambda^{N_1,l_1^r}}{\sum_{l \in T(N_1, c(y^{N_1,r}))} \lambda^{N_1,l}} \cdots
   \frac{\lambda^{N_K,l_K^r}}{\sum_{l \in T(N_K, c(y^{N_K,r}))} \lambda^{N_K,l}}. \tag{14} \]

First we can see that for all $l_1^r, \dots, l_K^r$, the vector $(x(l_1^r, \dots, l_K^r), w^r)$ satisfies
$A x(l_1^r, \dots, l_K^r) + G w^r \le b$. Indeed,

\[ \begin{aligned}
A x(l_1^r, \dots, l_K^r) + G w^r
  &= A(x^{N_1,l_1^r}) + \dots + A(x^{N_K,l_K^r}) + G w^r \\
  &= A(y^{N_1,l_1^r}) + \dots + A(y^{N_K,l_K^r}) + G w^r \\
  &= A(y^r) + G w^r \\
  &\le b,
\end{aligned} \]
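since $(y^r, w^r)$ is a mixed-0/1 solution of $Q$. It now suffices to prove that $x$ is the
convex combination of all the $x(l_1^r, \dots, l_K^r)$ using the corresponding coefficients
$\nu(l_1^r, \dots, l_K^r)$. Let us fix $N_i$ and an index $j \in N_i$, and let $\bar{x}_j^{N_i}$ denote the
corresponding component of this combination. We have

\[ \begin{aligned}
\bar{x}_j^{N_i}
 &= \sum_{r=1}^{R} \ \sum_{l_1^r \in T(N_1, c(y^{N_1,r}))} \cdots \sum_{l_K^r \in T(N_K, c(y^{N_K,r}))}
    \nu(l_1^r, \dots, l_K^r) \, x_j(l_1^r, \dots, l_K^r) \\
 &= \sum_{r=1}^{R} \ \sum_{l_1^r \in T(N_1, c(y^{N_1,r}))} \cdots \sum_{l_K^r \in T(N_K, c(y^{N_K,r}))}
    \nu(l_1^r, \dots, l_K^r) \, x_j^{N_i,l_i^r} \\
 &= \sum_{r=1}^{R} \ \sum_{l_i^r \in T(N_i, c(y^{N_i,r}))}
    \sigma_r \, \frac{\lambda^{N_i,l_i^r}}{\sum_{l \in T(N_i, c(y^{N_i,r}))} \lambda^{N_i,l}} \, x_j^{N_i,l_i^r},
\end{aligned} \tag{15} \]

the last identity being obtained using (14). For a fixed $i$, we have, using (12),

\[ \{1, \dots, R\} = S(N_i,1) \mathbin{\dot\cup} \dots \mathbin{\dot\cup} S(N_i,n_i). \]

Therefore we can rewrite (15) using indices running over the different $S(N_i,k)$.
Remark also that when we fix $r \in S(N_i,k)$, we have $c(y^{N_i,r}) = k$. We hence
have

\[ \begin{aligned}
\bar{x}_j^{N_i}
 &= \sum_{k=1}^{n_i} \ \sum_{p \in S(N_i,k)} \ \sum_{l \in T(N_i,k)}
    \sigma_p \, \frac{\lambda^{N_i,l}}{\sum_{q \in T(N_i,k)} \lambda^{N_i,q}} \, x_j^{N_i,l} \\
 &= \sum_{k=1}^{n_i} \ \sum_{l \in T(N_i,k)}
    \frac{\sum_{p \in S(N_i,k)} \sigma_p}{\sum_{q \in T(N_i,k)} \lambda^{N_i,q}} \, \lambda^{N_i,l} \, x_j^{N_i,l} \\
 &= \sum_{k=1}^{n_i} \ \sum_{l \in T(N_i,k)} \lambda^{N_i,l} \, x_j^{N_i,l},
\end{aligned} \tag{16} \]

where (16) is obtained using (13). We can use (9), namely

\[ T(N_i,1) \mathbin{\dot\cup} \dots \mathbin{\dot\cup} T(N_i,n_i) = \{1, \dots, L_i\}. \]

In particular it allows us to sum over $\{1, \dots, L_i\}$ in (16) instead of the summation
over $k$ and $l$. We therefore finally have

\[ \bar{x}_j^{N_i} = \sum_{l=1}^{L_i} \lambda^{N_i,l} \, x_j^{N_i,l}, \]

which is the desired result using (7). Finally, the sum of the $\nu$ coefficients is
equal to 1 due to their construction and the fact that $\sum_{r=1}^{R} \sigma_r = 1$. □

Example 6. Consider the set

\[ T = \{ x \in \{0,1,2\}^4 : x_1 + x_2 + 2x_3 + 3x_4 \le 7 \}. \]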
Table 2: The complete description of Example 6 in the original space
(columns $c_1, \dots, c_4$ and right-hand side $\gamma$; the coefficient entries are not recoverable here)


The complete facet description of $\operatorname{conv} T$ is given by the 14 inequalities $c^T x \le \gamma$ shown
in Table 2.

We now construct a value disjunction of the set $T$. To do this, we consider three
blocks $N_1 = \{1,2\}$, $N_2 = \{3\}$, $N_3 = \{4\}$. In block $N_1$ we consider the linear form
$x_1 + x_2$, which can take the values $0, 1, \dots, 4$ because $x_1$ and $x_2$ have an upper bound
of 2. We thus introduce four variables $y_1, y_2, y_3, y_4$ corresponding to the four nonzero
values. The blocks $N_2$ and $N_3$ are trivial, so we do not need to introduce new variables
in those cases. A valid formulation for $T$ is thus

\[ T = \operatorname{Proj}_x \big\{ (x, y) \in \{0,1,2\}^4 \times \{0,1\}^4 :\
   y_1 + 2y_2 + 3y_3 + 4y_4 + 2x_3 + 3x_4 \le 7, \]
\[ \qquad x_1 + x_2 = y_1 + 2y_2 + 3y_3 + 4y_4,\quad
   y_1 + y_2 + y_3 + y_4 \le 1 \big\}. \]

Theorem 5 now asserts that we obtain the complete description of the extended
formulation of $T$ by combining the complete descriptions of the polyhedra

\[ V_1 = \operatorname{conv}\big\{ (x_1, x_2, y) \in \{0,1,2\}^2 \times \{0,1\}^4 :\
   x_1 + x_2 = y_1 + 2y_2 + 3y_3 + 4y_4,\quad
   y_1 + y_2 + y_3 + y_4 \le 1 \big\} \]

and

\[ Q = \operatorname{conv}\big\{ (x_3, x_4, y) \in \{0,1,2\}^2 \times \{0,1\}^4 :\
   2x_3 + 3x_4 + y_1 + 2y_2 + 3y_3 + 4y_4 \le 7,\quad
   y_1 + y_2 + y_3 + y_4 \le 1 \big\}. \]

We obtain the facet description given by the inequalities $c^T x + d^T y \le \gamma$ shown in
Table 3. For each non-trivial inequality, we also mention whether it comes from $V_1$ or
from $Q$.

In the example it turns out that the number of inequalities describing conv T 
is the same in the two representations. This, however, is not always true. More- 
over, an inherent advantage of the second formulation over the first formulation 
is that its structure is better known. In particular, it may occur that the same 
polyhedron Vi appears in several different problems. In this case, the knowledge 
about the description of the polyhedron Vi can be used over and over again. 

The next section presents the case of a polyhedron that appears often in our 
experiments, namely the Vi polyhedron where all the coefficients of the variables 
x are the same. We show that we can compute a full description for this object. 
Table 3: The complete description of Example 6 in the extended space
(columns $c_1, \dots, c_4$, $d_1, \dots, d_4$, right-hand side $\gamma$, and the originating polyhedron $V_1$ or $Q$; the coefficient entries are not recoverable here)




3 A special family of linking polyhedra 

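In this section we study the linking polyhedra $V_i$ for the case where the columns
$A_j$ for $j \in N_i$ are identical and the variables $x_j$ are binary. In other words, we
study the polytope

\[ V_i = \operatorname{conv}\Big\{ (x^{N_i}, y^{N_i}) \in \{0,1\}^{|N_i|} \times \{0,1\}^{n_i} :
   \sum_{j \in N_i} x_j = \sum_{k=1}^{n_i} k \, y_k^{N_i},\
   \sum_{k=1}^{n_i} y_k^{N_i} \le 1 \Big\}. \]

We are able to give a complete description of this polytope $V_i$.

Theorem 7. $V_i$ is a polytope whose affine hull is given by the equation

\[ \sum_{j \in N_i} x_j = \sum_{k=1}^{n_i} k \, y_k^{N_i}. \tag{17a} \]

The facets of $V_i$ are given by:

\[ \sum_{j \in T} x_j - \sum_{k=1}^{|T|} k \, y_k^{N_i} - \sum_{k=|T|+1}^{n_i} |T| \, y_k^{N_i} \le 0
   \quad \text{for } \emptyset \ne T \subset N_i, \tag{17b} \]

\[ \sum_{k=1}^{n_i} y_k^{N_i} \le 1, \tag{17c} \]

\[ y_k^{N_i} \ge 0 \quad \text{for } k = 1, \dots, n_i. \tag{17d} \]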

Proof. We first show that the inequalities (17) are valid for $V_i$. To this end, let
$(x,y) \in \{0,1\}^{|N_i|} \times \{0,1\}^{n_i}$ be a vertex of $V_i$. If $y = 0$, then also $x = 0$, and
inequality (17b) is trivially satisfied. Otherwise, $y = e^k$ with $k = \sum_{j \in N_i} x_j =
|\operatorname{supp} x|$. Let $\emptyset \ne T \subset N_i$ be arbitrary. If $k \le |T|$, we have

\[ \sum_{j \in T} x_j - \sum_{\kappa=1}^{|T|} \kappa \, y_\kappa - \sum_{\kappa=|T|+1}^{n_i} |T| \, y_\kappa
   = \sum_{j \in T} x_j - k \le 0. \]

On the other hand, if $k > |T|$, we have

\[ \sum_{j \in T} x_j - \sum_{\kappa=1}^{|T|} \kappa \, y_\kappa - \sum_{\kappa=|T|+1}^{n_i} |T| \, y_\kappa
   = \sum_{j \in T} x_j - |T| \le 0. \]

Hence, (17b) is satisfied. The remaining inequalities are trivially valid for $V_i$.

For ease of notation we let $N = N_i$, $n = |N|$ and replace the variables
$y_k^{N_i}$ by simply $y_k$. Let $c^T x + d^T y \le \gamma$ be a facet-defining inequality of $V_i$ and
set

\[ F = \{ (x,y) \in V_i : c^T x + d^T y = \gamma \}. \]

We will show that $c^T x + d^T y \le \gamma$ corresponds to one of the inequalities in (17)
up to multiplication by a scalar. We assume that the variables in $N$ are reordered
such that $c_1 \ge c_2 \ge \dots \ge c_n$. Since $V_i$ is not full dimensional, we first transform
$c^T x + d^T y \le \gamma$ into a standard form. This can be achieved by adding multiples
of the equation (17a) to $c^T x + d^T y \le \gamma$. More precisely, we first proceed with
the following two steps.

(1) While there exists an index $i \in N$ such that $c_i < 0$, add $-c_i$ times Equation (17a)
to the inequality $c^T x + d^T y \le \gamma$. Let us again denote by
$c^T x + d^T y \le \gamma$ the resulting inequality. Notice that after terminating
with Step (1), we have that $c_i \ge 0$ for all $i \in N$ and $c_n = 0$.

(2) If $c_i > 0$ for all $i \in N$ and there exist $i, j \in N$ such that $c_i \ne c_j$, then
$c_1 > c_n > 0$ due to our reordering. In this case we subtract $c_n$ times
Equation (17a) from the inequality $c^T x + d^T y \le \gamma$. Notice that also after
Step (2) has been performed we have that $c_n = 0$ and $c_i \ge 0$ for all $i \in N$.

The preprocessing steps (1) and (2) guarantee that $c_i \ge 0$ for all $i \in N$. Now
let $s \in \{0, \dots, n\}$ be an index such that

\[ c_1 \ge c_2 \ge \dots \ge c_s > 0 = c_{s+1} = \dots = c_n. \]

We define $T = \{ i \in N : c_i > 0 \} = \{1, \dots, s\}$. We consider the following cases.

Case 1. If $T = \emptyset$, i.e., $c_1 = \dots = c_n = 0$, it follows that $c^T x + d^T y \le \gamma$
is a multiple of the inequality $\sum_{k=1}^{n} y_k \le 1$ or of one of the non-negativity
constraints $y_k \ge 0$.

Indeed, because $(0,0)$ is feasible, we have $\gamma \ge 0$. Since $F$ is a facet,
there must be $2n - 1$ affinely independent feasible points on it. If
$\gamma = 0$, we have $(0,0) \in F$; therefore, for all but one $k = 1, \dots, n$, a
point $(x, e^k)$ must be contained in $F$. This means that $d_k = \gamma = 0$ for
all but one $k = 1, \dots, n$. For the remaining $k \in \{1, \dots, n\}$ we have
$d_k < \gamma = 0$, so $c^T x + d^T y \le \gamma$ is a scalar multiple of the non-negativity
constraint $y_k \ge 0$.

On the other hand, if $\gamma > 0$, then $(0,0) \notin F$, so we have
$F \subseteq \{ (x,y) \in V_i : \sum_{k=1}^{n} y_k = 1 \}$, since $(0,0)$ is the only feasible integer point with
$y = 0$. Because $F$ is a facet, we have $F = \{ (x,y) \in V_i : \sum_{k=1}^{n} y_k = 1 \}$,
which corresponds to (17c).

Case 2. If $T = N$, we conclude from our previous analysis that $c_i = c_j > 0$ for
all $i, j \in N$. It follows that $c^T x + d^T y \le \gamma$ is implied by Equation (17a),
which contradicts the assumption that $F$ defines a facet of $V_i$.

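Case 3. Therefore, we may assume that $\emptyset \ne T \subset N$, $T \ne N$. Again, since $(0,0)$
is feasible, we have that $\gamma \ge 0$. If $\gamma > 0$, then $F \subseteq \{ (x,y) \in V_i :
\sum_{k=1}^{n} y_k = 1 \}$. Hence, we can assume that $\gamma = 0$.
We next define indices $1 \le i_1 < i_2 < \dots < i_r < s$ as follows:

\[ c_1 = \dots = c_{i_1} > c_{i_1+1} = \dots = c_{i_2} > \dots > c_{i_r+1} = \dots = c_s. \]

By testing the inequality $c^T x + d^T y \le 0$ with the feasible points
$(e^1, e^1)$, $(e^1 + e^2, e^2)$, $(e^1 + e^2 + e^3, e^3)$, $\dots$, we conclude that

\[ \begin{aligned}
-d_1 &\ge c_1 \\
-d_2 &\ge c_1 + c_2 \\
 &\ \ \vdots \\
-d_{i_1} &\ge c_1 + c_2 + \dots + c_{i_1} \\
-d_{i_1+2} &\ge \textstyle\sum_{j=1}^{i_1} c_j + c_{i_1+1} + c_{i_1+2} \\
 &\ \ \vdots \\
-d_{i_2} &\ge \textstyle\sum_{j=1}^{i_1} c_j + c_{i_1+1} + c_{i_1+2} + \dots + c_{i_2} \\
 &\ \ \vdots \\
-d_{i_r+1} &\ge \textstyle\sum_{j=1}^{i_r} c_j + c_{i_r+1} \\
-d_{i_r+2} &\ge \textstyle\sum_{j=1}^{i_r} c_j + c_{i_r+1} + c_{i_r+2} \\
 &\ \ \vdots \\
-d_s &\ge \textstyle\sum_{j=1}^{i_r} c_j + c_{i_r+1} + c_{i_r+2} + \dots + c_s.
\end{aligned} \]

Therefore, the inequality $c^T x + d^T y \le \gamma = 0$ is dominated by the
following conic combination of the inequalities (17b):

\[ \begin{aligned}
 & c_s \times \Big( \textstyle\sum_{i=1}^{s} x_i - \sum_{k=1}^{s} k \, y_k - \sum_{k=s+1}^{n} s \, y_k \le 0 \Big) \\
 +\ & (c_{i_r} - c_{i_r+1}) \times \Big( \textstyle\sum_{i=1}^{i_r} x_i - \sum_{k=1}^{i_r} k \, y_k - \sum_{k=i_r+1}^{n} i_r \, y_k \le 0 \Big) \\
 & \ \ \vdots \\
 +\ & (c_{i_1} - c_{i_2}) \times \Big( \textstyle\sum_{i=1}^{i_1} x_i - \sum_{k=1}^{i_1} k \, y_k - \sum_{k=i_1+1}^{n} i_1 \, y_k \le 0 \Big).
\end{aligned} \]

This completes the proof.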



□ 
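The validity part of Theorem 7 is easy to test numerically on small instances. The following Python sketch enumerates the integer points of the linking polytope with identical coefficients and evaluates the left-hand side of (17b) for every nonempty proper subset $T$. The function names are hypothetical, and the check covers validity and tightness only, not the facet property.

```python
from itertools import combinations, product


def integer_points(n):
    """Integer points (x, y) of the linking polytope for n identical binary columns.

    For each 0/1 vector x, y is the unit vector e^k with k = sum(x), or 0 if x = 0.
    """
    points = []
    for x in product((0, 1), repeat=n):
        k = sum(x)
        y = tuple(1 if kk == k else 0 for kk in range(1, n + 1))
        points.append((x, y))
    return points


def lhs_17b(x, y, T):
    """Left-hand side of inequality (17b) for the index set T (0-based indices)."""
    t, n = len(T), len(x)
    head = sum(k * y[k - 1] for k in range(1, t + 1))
    tail = sum(t * y[k - 1] for k in range(t + 1, n + 1))
    return sum(x[j] for j in T) - head - tail


def all_17b_valid_and_tight(n):
    """True if every inequality (17b) is valid and attained with equality by some point."""
    points = integer_points(n)
    for t in range(1, n):
        for T in combinations(range(n), t):
            if max(lhs_17b(x, y, T) for x, y in points) != 0:
                return False
    return True
```

For example, `all_17b_valid_and_tight(4)` examines the 14 nonempty proper subsets of a four-element block against all 16 integer points.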



Theorem 8. The separation problem over the linking polyhedron $V_i$ in the case
of identical coefficients can be solved in polynomial time.

Proof. Let $(x^*, y^*)$ be a point satisfying the polynomially many constraints (17a),
(17c), (17d). We show that, in polynomial time, we can decide whether $(x^*, y^*)$
satisfies the exponentially many inequalities (17b); if it does not, we can
construct a maximally violated inequality.

It is clear that among the inequalities (17b) with equal cardinality $|T| = s$,
a most violated inequality is the one where $T$ is the index set of the $s$ largest
components of $x^*$. Therefore it suffices to sort the variables $x_1^*, \dots, x_{|N_i|}^*$ such that

\[ x_1^* \ge x_2^* \ge \dots \ge x_s^* > 0 = x_{s+1}^* = \dots = x_{|N_i|}^*. \]

Then we can simply evaluate the violation of inequality (17b) for the sets $\{1\}$,
$\{1,2\}$, $\{1,2,3\}$, $\dots$, $\{1, \dots, s\}$ and pick the set which yields the maximal
violation. □
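The sorting argument in the proof of Theorem 8 translates directly into a short routine. The Python sketch below is one possible rendering; the function name `separate`, the 0-based indices, and the numerical tolerance are assumptions made for illustration, and the input point is assumed to satisfy (17a), (17c) and (17d) already.

```python
def separate(x_star, y_star, tol=1e-9):
    """Search for a most violated inequality (17b) at the point (x_star, y_star).

    Returns (T, violation), where T is a sorted list of 0-based indices, or
    (None, 0.0) if no prefix set of the sorted order is violated by more than tol.
    """
    n = len(x_star)
    # Sort indices by decreasing x-value; prefixes of this order are the candidate sets.
    order = sorted(range(n), key=lambda j: x_star[j], reverse=True)
    best_T, best_violation = None, 0.0
    prefix_sum = 0.0
    for t in range(1, n + 1):
        if x_star[order[t - 1]] <= tol:
            break  # only the positive components need to be considered
        prefix_sum += x_star[order[t - 1]]
        # In (17b), y_k has coefficient k for k <= |T| and |T| for k > |T|.
        rhs = sum(min(k, t) * y_star[k - 1] for k in range(1, n + 1))
        violation = prefix_sum - rhs
        if violation > best_violation + tol:
            best_T, best_violation = sorted(order[:t]), violation
    return best_T, best_violation


# A fractional point with x_1 + x_2 + x_3 = 2 = 3 * (2/3), so (17a) holds.
cut_set, cut_violation = separate((1.0, 1.0, 0.0), (0.0, 0.0, 2.0 / 3.0))
```

The routine runs in $O(n^2)$ time as written; maintaining the right-hand side incrementally would bring this down to the cost of sorting, but the simpler form mirrors the proof more closely.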



4 An application: The knapsack with three distinct coefficients

In this section, we show that the value disjunction procedure is a tool to compute 
complete descriptions in an extended space. As an example we consider the 0/1 
knapsack problem with three distinct coefficients: 

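\[ \mu \sum_{j \in N_1} x_j + \lambda \sum_{j \in N_2} x_j + \sigma \sum_{j \in N_3} x_j \le \beta,
   \qquad x \in \{0,1\}^{|N_1| + |N_2| + |N_3|}, \tag{18} \]

where $N_1, N_2, N_3$ are pairwise disjoint index sets. The convex hull of the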
sible solutions can have exponentially many vertices and facets. Moreover, the 
complete facet description for (18) is not known in general. In [20], the case of 
the knapsack with two different coefficients was solved. By applying the struc- 
ture theorem for value disjunctions (Theorem 5), we are able to give a complete 
description for an extended formulation of (18) using only polynomially many 
variables. 

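We consider the extended formulation of (18),

\[ \mu \sum_{j \in N_1} x_j + \lambda \sum_{j \in N_2} x_j + \sigma \sum_{j \in N_3} x_j \le \beta, \]
\[ \sum_{j \in N_i} x_j = \sum_{k=1}^{|N_i|} k \, y_k^{N_i} \quad \text{for } i = 1,2,3, \]
\[ \sum_{k=1}^{|N_i|} y_k^{N_i} \le 1 \quad \text{for } i = 1,2,3, \]
\[ x^{N_i} \in \{0,1\}^{|N_i|},\ y^{N_i} \in \{0,1\}^{|N_i|} \quad \text{for } i = 1,2,3. \]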
Theorem 5 provides us with the framework to describe the convex hull of such an
extended formulation. It is given by the intersection of the linking polyhedra
and the aggregated polyhedron. The linking polyhedra were studied in the
last section; Theorem 7 gives a complete facet description of them. Concerning
the aggregated polyhedron, we will make use of a vertex description. It is the
convex hull of the set described by

\[ \mu \sum_{k=1}^{|N_1|} k \, y_k^{N_1} + \lambda \sum_{k=1}^{|N_2|} k \, y_k^{N_2} + \sigma \sum_{k=1}^{|N_3|} k \, y_k^{N_3} \le \beta, \]
\[ \sum_{k=1}^{|N_i|} y_k^{N_i} \le 1 \quad \text{for } i = 1,2,3, \]
\[ y^{N_i} \in \{0,1\}^{|N_i|} \quad \text{for } i = 1,2,3. \]

Clearly there are at most $(1 + |N_1|) \cdot (1 + |N_2|) \cdot (1 + |N_3|)$ vertices. We denote
them by $v^1, \dots, v^p \in \{0,1\}^{|N_1| + |N_2| + |N_3|}$.
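Because each block contributes at most one nonzero value variable, the vertices of the aggregated polyhedron can be listed by looping over the number of selected items per block. The following Python sketch does this for arbitrary block sizes; the function name `aggregated_vertices` and the sample data are illustrative assumptions.

```python
from itertools import product


def aggregated_vertices(sizes, coeffs, beta):
    """List the 0/1 points y = (y^{N_1}, y^{N_2}, ...) of the aggregated knapsack set.

    sizes  -- block cardinalities |N_i|
    coeffs -- the distinct knapsack coefficient of each block
    beta   -- right-hand side of the knapsack inequality
    """
    vertices = []
    # counts[i] = number of items selected in block i (0 means y^{N_i} = 0).
    for counts in product(*(range(n + 1) for n in sizes)):
        if sum(c * k for c, k in zip(coeffs, counts)) <= beta:
            y = []
            for n, k in zip(sizes, counts):
                y.extend(1 if kk == k else 0 for kk in range(1, n + 1))
            vertices.append(tuple(y))
    return vertices


# Hypothetical instance: blocks of sizes 2, 2, 1 with coefficients 2, 3, 5 and beta = 7.
sample = aggregated_vertices((2, 2, 1), (2, 3, 5), 7)
```

The loop visits exactly $(1 + |N_1|)(1 + |N_2|)(1 + |N_3|)$ candidates, which matches the upper bound on the number of vertices stated above.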

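Theorem 9. The complete facet description of (18) in an extended space is
given by:

\[ y = \sum_{j=1}^{p} z_j v^j, \]
\[ \sum_{j=1}^{p} z_j = 1, \]
\[ z_j \ge 0 \quad \text{for } j = 1, \dots, p, \]
\[ \sum_{j \in N_i} x_j = \sum_{k=1}^{n_i} k \, y_k^{N_i} \quad \text{for } i = 1,2,3, \]
\[ \sum_{j \in T} x_j \ge \sum_{\substack{k \in \{1, \dots, n_i\}: \\ |T| + k > n_i}} (|T| + k - n_i) \, y_k^{N_i}
   \quad \text{for } i = 1,2,3 \text{ and } \emptyset \ne T \subset N_i, \]
\[ x \in \mathbb{R}^{|N_1| + |N_2| + |N_3|}, \quad y \in \mathbb{R}^{|N_1| + |N_2| + |N_3|}, \quad z \in \mathbb{R}^{p}. \]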

Proof. This follows from Theorem 5. □ 

It is straightforward to extend our construction to binary integer programs 
with a fixed number of different columns. 



5 Branching on value disjunctions 

So far we have presented the value disjunction technique as a theoretical tool to 
define extended formulations which may yield more tractable polyhedral descrip- 
tions. Clearly it would be too much to expect general results on the existence 
or constructability of an intermediate representation for an arbitrary integer 
program that is better than the original formulation. The more modest goal 
of this section is to provide evidence for the practical usefulness of the value
disjunction technique, using a limited set of computational experiments. 

We shall restrict ourselves to experiments where we perform branching on 
the new binary variables of the extended formulation. We first need to discuss 
the situations for which we propose to make use of our new technique, so as to 
complement the existing branch-and-cut techniques. 

On the simplification effect of branching. Today mixed integer linear
programs are solved using branch-and-cut algorithms, which combine
two phases: a cutting phase, whose objective is to tighten the current
formulation, and a branching phase. However, as of today there are essentially
no mathematical arguments available that help to decide when it is more
efficient to branch or to cut. This question is fundamental, since computational
experiments clearly reveal that neither a pure branch-and-bound algorithm nor
a pure cutting plane algorithm can solve the instances that the combination of
the two can manage to solve. One partial answer to this question is given by the
fact that branching not only generates subproblems with fewer variables but,
more importantly, the polyhedral description of each of the two subproblems is
significantly simpler than the original facet description. We illustrate this point
through an example.

Example 10. We consider the feasible region

\[ 7x_1 + 5x_2 - x_3 - x_4 - 2x_5 - 3x_6 - 4x_7 - 6x_8 \le 1, \qquad x_i \in \{0,1\}. \]

The non-trivial facets of the convex hull are shown in Table 4. If we consider the four
subproblems where the variables $x_7$ and $x_8$ are fixed to the possible values, we obtain
much simpler facet descriptions; see Table 5.

This example illustrates why branching is such an important tool in solving 
mixed integer programs. The question emerges how to obtain branching deci- 
sions such that the polyhedral description for each of the branches becomes as 
easy as possible. Thus, when we compare branching decisions in our experi- 
ments, we shall use the following definition. 

Definition 11. The complete description size of an n-way branching decision 
is defined as the sum of the numbers of facets in the complete descriptions of 
the n subproblems. 

Clearly this definition should only be used for comparing branching decisions 
with an equal number of subproblems. For our experiments, we used PORTA 
[9], version 1.3, to enumerate the feasible solutions and to compute the facet 
description of their convex hull. As the computation times for problems of higher 
dimension would be prohibitive, we had to restrict ourselves to experiments with 
very small integer programs. Specifically, we generated dense 0/1 problems with 
twelve binary variables and two rows. The four test instances are shown in 
Table 6. 

Table 4: Full description of Example 10

Table 5: Full description of the subproblems of Example 10

Table 6: Randomly generated problem instances. Instances 1 and 2 have been
generated randomly by drawing the coefficients independently and uniformly
from the set $\{-20, \dots, +20\}$. The right-hand side is always 0. Instance 3
has been modified manually, so that the first three variables have identical
coefficients. Finally, instance 4 is a variation of instance 3 where the
coefficients of the first three variables are very close to each other.

On the limitations of current LP-based branching schemes. A single-variable
branching scheme, as used in today's branch-and-cut systems, is usually
driven by information obtained from the current LP relaxation ("most
infeasible variable selection"), by lookahead-based techniques ("strong
branching"), or by history-based prediction ("pseudo-cost branching"). There
is a large class of problems that are extremely difficult to solve for current
branch-and-cut systems because none of the above criteria provides a
meaningful basis for a branching decision. An extreme example is given by the
market split instances of Cornuéjols and Dawande [10]: Here the LP
relaxations of all subproblems have the value 0 until most of the variables
have been fixed. However, it was shown that branch-and-bound is indeed the
right tool for solving the market split instances: While LP-based
single-variable branching fails, it is very successful to branch on certain
general disjunctions that can be derived from the problem structure via
lattice basis reduction [2]. Though this technique has proved very successful
for solving market split problems [1] and also for the so-called banker's
problem [14], it has not become a general tool for branch-and-cut systems.

We also refer to the recent work [12], where a branching method along general
disjunctions is proposed. Here the quality of a disjunction (branching
direction) is measured by the depth of the intersection cut corresponding to
the disjunction; among the best disjunctions, strong branching is used to
select one. The computational results for many benchmark problems from MIPLIB
are very promising. However, for a few instances the proposed branching
scheme fails to close any gap. These include the market split instances
markshare1 and markshare2.

A branching scheme based on value disjunctions. We propose a new 
branching scheme based on value disjunctions, which we hope is general enough 
to be useful as a branching scheme for general integer programs. It is purely 
based on the analysis of the structure of the integer program, and is designed 
to complement the above mentioned LP-based prediction methods. 

The basic idea of the new branching scheme is to partition the set $N$ of
problem variables into blocks $N_i$ and to move over to the extended
formulation given by the value disjunction. In addition to the original
variables, we can then branch on the newly introduced binary variables. In
fact, because exactly one binary variable of each block can be set to 1, we
can perform SOS branching on these variables. The question, of course, is how
to construct a suitable partition of $N$.

Claim 1. One should choose a set of variables whose columns are structurally 
similar and perform a value disjunction according to a relaxation where we re- 
place the original coefficients by simpler ones. 

For our experiment, we decided to pick three of the twelve binary variables,
$x_i, x_j, x_k$, say. We then add the (redundant) constraint
$x_i + x_j + x_k \le 3$. When we construct a value disjunction with respect to
this constraint, we need to introduce four variables $y_0, y_1, y_2, y_3$,
corresponding to the possible values of the form $x_i + x_j + x_k$. Performing
SOS branching on $y_0 + y_1 + y_2 + y_3 = 1$ yields four subproblems. To
compute the complete description size of the value disjunction branching on
$x_i, x_j, x_k$, we sum up the numbers of facets in each of these four
subproblems. To make a comparison with traditional single-variable branching,
we need to consider a branching strategy that yields the same number of
subproblems. To this end, we pick two original variables, $x_i, x_j$ say, and
consider the subproblems where we fix these variables to the possible values.
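
The two four-way branchings compared here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix `A` is a hypothetical 2-row instance (not one of the instances of Table 6), the helper names are ours, and the sketch counts feasible points per branch rather than facets, since facet enumeration requires a tool such as PORTA.

```python
from itertools import product

# Hypothetical 2-row instance with 12 binary variables (NOT from Table 6);
# a point x is feasible if row . x <= 0 holds for both rows.
A = [
    [12, -7, 3, -15, 8, -2, 19, -11, 5, -9, 14, -6],
    [-4, 16, -13, 6, -18, 10, -1, 7, -20, 2, -8, 17],
]

def feasible_points(rows, rhs=0):
    """Enumerate all 0/1 points x with row . x <= rhs for every row."""
    n = len(rows[0])
    return [x for x in product((0, 1), repeat=n)
            if all(sum(a * v for a, v in zip(row, x)) <= rhs for row in rows)]

def value_disjunction_branches(points, block):
    """Split points by the value v of the sum over `block` (branch y_v = 1)."""
    branches = {v: [] for v in range(len(block) + 1)}
    for x in points:
        branches[sum(x[i] for i in block)].append(x)
    return branches

def fixing_branches(points, pair):
    """Split points by the 0/1 values to which the variables in `pair` are fixed."""
    branches = {vals: [] for vals in product((0, 1), repeat=len(pair))}
    for x in points:
        branches[tuple(x[i] for i in pair)].append(x)
    return branches

points = feasible_points(A)
vd = value_disjunction_branches(points, block=(0, 1, 2))  # y_0, ..., y_3
fx = fixing_branches(points, pair=(0, 1))                 # (0,0), ..., (1,1)
print({v: len(p) for v, p in vd.items()})
print({v: len(p) for v, p in fx.items()})
```

Both schemes partition the feasible set into four parts; the complete description size of either scheme would then be obtained by computing the facets of the convex hull of each part and adding up the counts.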
We next defined a "ranking formula" for the selection of the three variables
$x_i, x_j, x_k$ that give rise to the value disjunction. Let $A_i, A_j, A_k$
denote the columns of these variables. Then let

$$R(\{i,j,k\}) = \min_{r} \frac{2\,\bigl(\max\{A_{r,i}, A_{r,j}, A_{r,k}\} - \min\{A_{r,i}, A_{r,j}, A_{r,k}\}\bigr)^2}{2 + \bigl|\operatorname{med}\{A_{r,i}, A_{r,j}, A_{r,k}\}\bigr|},$$

where the minimum is taken over the rows $r$ and
$\operatorname{med}\{A_{r,i}, A_{r,j}, A_{r,k}\}$ denotes the median of the
three values. The formula was designed so that (i) columns that have "similar"
coefficients in at least one of the rows yield a low (good) result; (ii)
columns with large coefficients yield a low result. The rationale of this
ranking is that, intuitively, the value disjunction for a selection of similar
columns should lead to simpler subproblems; also columns with large
coefficients should have a larger impact on the rest of the problem than
columns with small coefficients.
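
As a minimal sketch, the rank can be computed as follows, assuming the formula is read as the minimum over the rows $r$ of $2(\max - \min)^2 / (2 + |\mathrm{med}|)$ of the three coefficients in row $r$. The matrix `A` is hypothetical (it is not one of the test instances), and the function name `rank` is ours.

```python
from itertools import combinations
from statistics import median

def rank(A, block):
    """Rank of a block of three columns (low = good), read as the minimum
    over the rows of 2 * (max - min)^2 / (2 + |median|)."""
    scores = []
    for row in A:
        vals = [row[c] for c in block]
        spread = max(vals) - min(vals)
        scores.append(2 * spread ** 2 / (2 + abs(median(vals))))
    return min(scores)

# Hypothetical 2-row instance (NOT from Table 6): in the first row, the
# first three columns have nearly identical, large coefficients.
A = [
    [20, 19, 20, -3, 7, -12, 4, 15, -8, 1, -17, 9],
    [-5, 11, 2, 14, -9, 6, -16, 3, 18, -7, 10, -13],
]

# Rank every choice of three columns and list the best-ranked ones first.
ranked = sorted(combinations(range(12), 3), key=lambda b: rank(A, b))
for block in ranked[:5]:
    print(block, round(rank(A, block), 3))
```

Because the minimum is taken over the rows, a block only needs similar coefficients in one row to obtain a low rank; identical coefficients in some row give rank 0.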

Example 12. For test instance 4, selecting the variables $x_1, x_2, x_3$ has
the rank $R(\{1,2,3\}) = 0.083$; selecting the variables $x_7, x_9, x_{10}$ has
the rank $R(\{7,9,10\}) = 108$.

For all possible branching decisions (i.e., the $\binom{12}{3} = 220$ choices
of three variables), we now computed the rank and the complete description
size. We grouped the branching decisions according to their rank into sets of
the 5 best ranked, 10 % best ranked, 30 % best ranked, etc. choices. For each
of the test instances, we show histograms of the complete description sizes
corresponding to branching decisions within these rankings in Figures 2-5. As
a comparison, the bottom part in each figure shows a histogram of the complete
description sizes obtained by the $\binom{12}{2} = 66$ possible choices for
two-variable branching. In each histogram the vertical line shows the average
(arithmetic mean) of the complete description sizes.
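
The grouping step can be sketched as follows. The ranks and complete description sizes used here are placeholder numbers chosen only to make the sketch runnable (they are not experimental data), and the function names are ours.

```python
from itertools import combinations
from math import comb
from statistics import mean

def best_ranked_groups(ranks, percents=(10, 30), top_k=5):
    """Nested groups of choices: the top_k best ranked and, for each
    percentage, the best-ranked share of all choices (low rank = good)."""
    ordered = sorted(ranks, key=ranks.get)
    groups = {f"{top_k} best": ordered[:top_k]}
    for p in percents:
        count = -(-p * len(ordered) // 100)  # ceiling of p % of all choices
        groups[f"{p} % best"] = ordered[:count]
    return groups

def average_size(group, sizes):
    """Arithmetic mean of the complete description sizes within a group."""
    return mean(sizes[c] for c in group)

choices = list(combinations(range(12), 3))
assert len(choices) == comb(12, 3)  # 220 three-variable choices

# Placeholder ranks and sizes (NOT experimental data), for illustration only.
ranks = {c: float(i) for i, c in enumerate(choices)}
sizes = {c: 100 + (i % 7) for i, c in enumerate(choices)}

groups = best_ranked_groups(ranks)
for name, group in groups.items():
    print(name, len(group), average_size(group, sizes))
```

With 220 choices, the three groups contain 5, 22 and 66 branching decisions; the average computed per group corresponds to the vertical line in each histogram.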

From the computational results, we can draw the following conclusions: 

1. It is possible to use the rank formula to predict which branching decisions
will lead to low complete description sizes.

2. For instances 1 and 2, which do not contain selections of very low rank,
two-variable branching performs better than branching on value disjunctions.
However, for instances 3 and 4, which contain selections of very low rank, it
is possible to take branching decisions that are better than two-variable
branching decisions by making use of the rank formula.

We have to remark that there is room for improvement of the proposed 
ranking formula. Clearly it needs to be generalized for blocks of different cardi- 
nalities. It would also need adjustment for unequally scaled rows. 



Figure 2: Branching on value disjunctions vs. 2-variable branching (instance 1).
The figure shows histograms of the total number of facets in the subproblems;
the vertical line is the average.

Figure 3: Branching on value disjunctions vs. 2-variable branching (instance 2)

Figure 4: Branching on value disjunctions vs. 2-variable branching (instance 3)

Value disjunction branching on larger problems. Based on the evidence
obtained with the above experiments, we tried to use the new branching scheme
to solve larger test problems. Our set of test instances consists of instances
with several dense rows (multi-knapsack problems). We focused on problems
where the solutions to the LP relaxations of the subproblems give only little
information for taking branching decisions. The test instances are:

• Six randomly generated market split instances with 35 and 40 variables.

• The models mas74 and mas76 from the MIPLIB.

It seems difficult to apply Theorem 5 directly to these problems. The reason
is that a model typically contains many constraints. In this case the
probability that we can come up with a block decomposition such that some
values repeat is quite low. Hence, one may expect that in such cases the value
reformulation requires introducing, for each block of the partition
$N_1, \dots, N_k$, as many variables as the block has subsets. Therefore, we
decided to perform the following steps:

1. We consider one of the dense rows at a time. We add a relaxation of this
row that we obtain by replacing the coefficients by simpler ones. From
the row

$$\sum_{i=1}^{n} a_i x_i + \sum_{j=1}^{d} g_j w_j \le b,$$

we generate the relaxation

$$\sum_{i=1}^{n} f(a_i)\, x_i \le M,$$

where $f(x)$ is a non-linear function of the type

$$f(x) = \begin{cases} 1 & \text{if } x > U, \\ 0 & \text{if } L < x < U, \\ 1 & \text{if } x < L. \end{cases}$$

2. We reformulate the problem using a value disjunction for each of the new 
rows separately. 

3. Finally, we manually perform SOS branching on the new variables. Then 
we solve each of the sub-problems with the standard branch-and-cut system 
CPLEX 9.1 [11] using the default settings of the Callable Library. We use 
the optimal solution value from a subproblem as a primal bound for the 
remaining subproblems. 
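
Steps 1 and 2 can be illustrated by the following sketch. It rests on several assumptions that are not taken from the text: the dense row `coeffs` and the thresholds `L`, `U` are hypothetical, the helper names are ours, the values assigned to the three pieces of $f$ are left as parameters (defaulting to the values as printed above), and the right-hand side `M` is simply chosen as the largest attainable value, which makes the added row redundant.

```python
def three_piece(a, L, U, high=1, mid=0, low=1):
    """Three-piece coefficient map: `high` above U, `low` below L, `mid`
    otherwise. The default piece values are those printed in the text."""
    if a > U:
        return high
    if a < L:
        return low
    return mid

def attainable_values(row):
    """All values of sum_i row[i] * x_i over x in {0,1}^n (subset sums)."""
    values = {0}
    for c in row:
        values |= {v + c for v in values}
    return sorted(values)

# Hypothetical dense row and thresholds, chosen only for illustration.
coeffs = [87, 12, 45, 93, 5, 60, 78, 31]
L, U = 20, 70

simple_row = [three_piece(a, L, U) for a in coeffs]
values = attainable_values(simple_row)
M = max(values)  # assumption: this choice of M makes the new row redundant

# One new binary variable y_v per attainable value v; SOS branching on
# sum_v y_v = 1 then yields one subproblem per value.
print(simple_row)
print(values)
```

The number of new binary variables equals the number of attainable values of the simplified row, which is at most $n + 1$ for coefficients in $\{0, 1\}$, instead of one variable per subset of the block.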

The results of this approach on the set of test instances are shown in Table 7. It
can be seen that the approach provides a clear gain on all these instances. Both
the number of nodes and the computation times are reduced in comparison to
the performance of CPLEX 9.1 (with the default settings of the Callable Library)
on the original problem.

Table 7: Branching on value disjunctions for the market split and mas instances.
Computation times are given in CPU seconds on a Sun Fire V890 with 1200 MHz
UltraSPARC-IV processors.

                             CPLEX 9.1             Value Disjunctions
Name       Rows  Cols  Nodes (10^6)  Time (s)  Nodes (10^6)  Time (s)
corn535-1     5    40        13.8       2 431         3.8         809
corn535-2     5    40        11.9       2 084         4.2         865
corn535-3     5    40        17         2 946         9.8       1 970
corn540-4     5    45       321        55 918       105        20 873
corn540-5     5    45       231        39 787        87        17 267
corn540-6     5    45       188        30 532        97        19 162
mas74        13   151         4.4       2 463         1.2       1 194
mas76        12   151         0.667       289         0.063        35